Finding Gene Function using LitMiner

Authors

  • Berry de Bruijn
  • Joel D. Martin
Abstract

NRC (National Research Council, Canada) submitted two sets of results for the primary task in the TREC Genome track. The systems that generated these results were tuned primarily to achieve very high recall (above 90%) and secondarily to minimize the number of documents retrieved. Both submitted sets were the outputs of automatic systems (non-interactive, non-supervised) with a modular architecture. The TREC evaluation confirmed that recall for both submissions was extremely high: 543 out of 566 target documents (0.9594) were returned. In addition, these systems returned far fewer documents than the Genome track rules allowed: an average of 196 documents per query across the 50 queries, with a median of only 100 documents. The first submission was based entirely on Information Retrieval techniques, tuned to achieve very high recall and fair precision. Its average precision was 0.3941, which ranked third out of the 49 runs submitted by all participants. For the second submission, documents were reranked based on the output of an information extraction module tuned towards identifying gene function papers. This module identified 539 documents as highly promising; 121 of these turned out to be target documents and 418 did not. Contrary to our expectations, this caused average precision to drop slightly to 0.3771. The second submission ranked fifth out of the 49 runs.
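The ratios quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `ratio` is hypothetical, and the second figure (the share of documents flagged by the information extraction module that were actual targets) is derived here from the counts in the abstract; it is not a number reported by the authors and is distinct from the average precision scores.

```python
def ratio(hits: int, total: int) -> float:
    """Return hits / total, rejecting a non-positive denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


# Recall: target documents returned / all target documents (from the abstract).
recall = ratio(543, 566)

# Share of the 539 documents flagged by the extraction module that were
# targets (121 targets, 418 non-targets). Derived here, not reported.
flagged_precision = ratio(121, 121 + 418)

print(f"recall = {recall:.4f}")
print(f"precision of flagged set = {flagged_precision:.4f}")
```

The recall matches the 0.9594 stated in the abstract. The low share of true targets among the flagged documents is consistent with the reported drop in average precision after reranking.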

Similar resources

LitMiner and WikiGene: identifying problem-related key players of gene regulation using publication abstracts

The LitMiner software is a literature data-mining tool that facilitates the identification of major gene regulation key players related to a user-defined field of interest in PubMed abstracts. The prediction of gene-regulatory relationships is based on co-occurrence analysis of key terms within the abstracts. LitMiner predicts relationships between key terms from the biomedical domain in four c...

LitMiner: integration of library services within a bio-informatics application

Background: This paper examines how the adoption of a subject-specific library service has changed the way in which its users interact with a digital library. The LitMiner text-analysis application was developed to enable biologists to explore gene relationships in the published literature. The application features a suite of interfaces that enable users to search PubMed as well as local databas...

Optimization of Gene Expression Programming Model using Wavelet Transform for Simulating Long-term Rainfall in Anzali City

Due to drought and climate change, estimation and prediction of rainfall is quite important in various areas all over the world. In this study, a novel artificial intelligence (AI) technique (WGEP) was developed to model long-term rainfall (67 years period) in Anzali city for the first time. This model was combined using Wavelet Transform (WT) and Gene Expression Programming (GEP) model. Firstl...

Impact of Exercise Endurance Training on PurB Gene Expression and Cardiac Function

Introduction: Endurance training has significant effects on the renewal of heart tissue, including myosin heavy chain (MHC) proteins. On the other hand, purine-rich element-binding protein β (purB) decreases αMHC gene expression. The aim of this study was to determine the impact of exercise endurance training on purB gene expression in the heart of Wistar rats. Methods: Fourteen r...

Ectopic Expression of Embryo/Cancer Sequence A (ECSA) in KYSE-30 Cell Line Using Retroviral System

Background: Human preimplantation embryonic cells share many similarities with cancer cells, such as the ability to self-renew, unlimited proliferation, and maintenance of the undifferentiated state. Embryo-cancer sequence A (ECSA), also known as developmental pluripotency associated-2 (DPPA2), is a cancer testis antigen (CTA) whose biological function is still unclear. Objective: CTAs are expressed normal...

Publication date: 2003